Abstract

A methodology for inferring hierarchies 
representing heuristic knowledge about the 
check out, control, and monitoring sub-system 
(CCMS) of the space shuttle launch process- 
ing system from natural language input is ex- 
plained. Our method identifies failures explic- 
itly and implicitly described in natural lan- 
guage by domain experts and uses those de- 
scriptions to recommend classifications for in- 
clusion in the experts’ heuristic hierarchies. 

1 Introduction 

It is becoming generally accepted that most ex- 
perts organize their problem-solving knowledge 
into a hierarchy of concepts [Gomez and Chandrasekaran,
1984; Clancey, 1985]. This hierarchical
organization of knowledge is not explicitly
used by the experts during the solution
of problems, but rather is used in an implicit 
form. The task of the knowledge acquisition 
programs is to extract this hierarchical organi- 
zation from the experts by making explicit to 
them the steps they need to visit in arriving
at solutions. In other words, the goal of the
knowledge acquisition interface is to make explicit
the hierarchy of concepts. A well-known
knowledge acquisition methodology to acquire 
hierarchical knowledge from experts is that of 
repertory grids [Boose and Bradshaw, 1988; 
Boose, et al., 1989; Gaines and Shaw, 1988].
The repertory grid methodology elicits catego- 
rizations, called constructs, from the expert by 
asking him/her to rank numerically elements of 
the domain according to how well they satisfy 
a given construct. 

Although this methodology has achieved 
considerable success, the problem of construct 
selection remains one of the most serious bot- 
tlenecks in the repertory grid methodology. If 
the constructs are provided to the domain ex- 
pert by the knowledge engineer, the method 
works reasonably well because the task of the 
domain expert consists of filling in the cells of 
the grid with the appropriate values. However, 
in most cases the key aspect of the knowledge 
acquisition task is the acquisition of the con- 
structs themselves from the domain expert. In 
this regard, elicitation techniques face strong 
limitations due to the fact that the linguistic as- 
pect and contextual knowledge associated with 
the constructs are difficult to handle by elicita- 
tion techniques alone. 

Our own research has been addressing this 
problem by studying the automatic construction
of constructs or categorizations from natural
language input. In [Gomez and Segami,
1991], the reader may find a description of lin- 
guistic constructions whose underlying struc- 
tures are hierarchical categorizations. In this 
paper, however, we study the problem of inferring
classifications from natural language sentences,
rather than that of directly mapping
into hierarchical structures. In order to provide 
some motivation for the problem we are facing, 
Figure 1 contains a portion of the heuristic hier- 
archy acquired from domain experts using our 
present interface. The problem we have expe- 
rienced with our present interface is similar to 
the acquisition of constructs in the repertory 
grid methodology. If a good portion of the 
heuristic hierarchy is provided to the domain 
expert by the knowledge engineer, he/she can 
continue from there without considerable dif- 
ficulty. However, building the hierarchy from 
scratch by the domain expert is a different 
matter altogether. Then, the main idea is to 
ask the expert to describe a given problem (a 
CCMS computer error in our application), in- 
fer some categorizations from the natural lan- 
guage description, and ask the expert to select 
the relevant one(s). This is basically the main 
idea that we explore in this paper in the context 
of the CCMS space shuttle network. 

The remainder of this paper is organized into 
6 sections. Section 2 describes the problem 
domain and our original knowledge acquisition 
interface. Section 3 describes the relationship 
between the interface, the Natural Language 
Component (NLC), and the Classification Sug- 
gestion Module (CSM). Section 4 explains the 
structures passed from the NLC to the CSM. 
Section 5 describes how the CSM infers classi- 
fications. Section 6 provides an overview of the 
NLC. Section 7 gives the authors’ conclusions
and lists future work to be done. 

2 Automatic Knowledge Acquisition Interface (AKAI)

OPERA (Expert System Analyst) is an expert 
system whose task is to improve the operations 
support of the computer network in the space 
shuttle launch processing system at Kennedy 
Space Center [Adler, et al., 1989]. OPERA
functions as a consultant to systems engineers 
by suggesting probable causes and recommending
diagnostic and operational advisories regarding
network error messages generated by
the check out, control, and monitor subsys- 
tem (CCMS). Because OPERA only has in- 
formation on approximately 10% of the 1300 
error messages generated by the CCMS net- 
work, some type of knowledge acquisition tool 
is needed. During the past several years we 
have worked to develop a knowledge acquisition 
interface for OPERA. The result of this effort 
has been the creation of the Automatic Knowl- 
edge Acquisition Interface or simply AKAI. 

It became apparent to us as we worked on 
the interface that while OPERA is not based 
on classification problem-solving, AKAI could 
make use of classification hierarchies [Gomez, 
et al., 1992a]. Two distinct types of classification
hierarchies were identified and are now
used by the interface: heuristic hierarchies and 
factual hierarchies. Heuristic hierarchies rep- 
resent heuristic problem-solving knowledge of 
the domain. Each expert has his/her own ideas 
about how this knowledge is organized depend- 
ing on their personal experience. Factual hi- 
erarchies represent hard or factual knowledge 
about the physical structure of physical ob- 
jects. A factual hierarchy for the CCMS net- 
work was constructed and is currently being 
used by the interface. Because of the static
nature of the CCMS network, the factual hierarchy
is rarely modified. Of primary concern
to us is the acquisition of the heuristic knowl- 
edge possessed by CCMS experts. Therefore, 
the focus of our research now is acquiring and 
constructing heuristic hierarchies, with the goal 
of AKAI being to acquire probable causes and 
advisories from systems engineers as efficiently 
as possible. 

Towards this goal, user-friendly features such
as pull-down menus, mouse selectable text, and 
a wealth of functions to reorganize the hierar- 
chy were incorporated in AKAI. Beta testing 
revealed, however, that naive users still had 
difficulty during the initial stages of heuristic 
hierarchy construction for the reasons stated 
above. In an effort to address this problem,
we have added a natural language component
(NLC) and a classification suggestion module
(CSM).

Figure 1: A Portion of a Heuristic Hierarchy for the CCMS Domain


3 The Improved Knowledge 
Acquisition Interface 

The operation of the interface, graphically dis- 
played in Figure 2, has changed only slightly 
due to the addition of the NLC and the 
CSM. The NLC is constructed around SNOWY 
[Gomez & Segami, 1989, 1990, 1991]. SNOWY 
is responsible for parsing (determining the syntactic
constituents of the sentence), interpreting
(constructing the logical form of the
sentence), and forming (mapping the logical
form of the sentence into SNOWY’s representation
language). The NLC is called by
the interface during error categorization. At 
this time, the expert is asked to place the er- 
ror message he/she has chosen to describe in 
his/her heuristic hierarchy. During the first 
stages of hierarchy construction there is a good 
chance that the appropriate category for the
error message currently being described is not
already in the heuristic hierarchy. In the original
interface, the expert was expected to know,
and was asked for, the name of an appropriate
category. This was often a problem in the
initial stages, and the experts caught in these 
situations tended to provide unsound catego- 
rizations. 

The interface has since been enhanced to 
help unsophisticated users add new error cat- 
egories to their heuristic hierarchies. If a user 
is unsure of how to classify an error, he/she 
is asked to provide a short description of what 
he/she knows about the error. This description 
typically consists of two or three sentences de- 
tailing relevant information about the message. 
The text is saved and passed to the NLC. The 
NLC enlists SNOWY to parse, interpret, and
form the sentences. If SNOWY can make sense 
of the expert’s description, the output of the 
formation phase is then passed to the CSM.
The CSM uses the formation output to recom- 
mend categories to the expert. If one or more 
of these recommendations are selected by the 
expert as acceptable, the problem of classifying
the error is solved and the suggested error cate- 
gories as well as the error message are placed in 
the expert’s heuristic hierarchy. The interface 
then prompts the user for the probable causes 
of the error message and its operational and di- 
agnostic advisories (this function of AKAI was 
not changed by the addition of the NLC and
CSM). 

On the other hand, if the expert is not sat- 
isfied with the CSM’s recommendations or if 
SNOWY is unable to understand the expert’s 
description, we may still be able to make a rea- 
sonable suggestion by postponing the classifi- 
cation of the error message until the probable 
causes have been entered by the domain expert 
and examined by the interface. We strongly be- 
lieve that the probable causes represent an ex- 
cellent source of text that is understandable by 
SNOWY and will provide classifications worth 
recommending. Most of the probable cause 
data that has been collected so far is of the
form “X has failed,” where X is a component of
the CCMS network. SNOWY is quite capable 
of understanding sentences in this form. The 
classifications suggested by the CSM for these 
sentences are usually relevant because experts 
commonly use failed component names as cate- 
gory names within their heuristic hierarchies. If 
this process fails, however, the NLC and CSM 
are deactivated and the user falls back on the 
features of the original interface. 
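
The fallback order described above can be sketched as follows. This is a minimal illustration only, not AKAI's code: the function name `classify_error` and the three callables (`understand` standing in for the NLC, `suggest` for the CSM, and `choose` for the expert's selection) are hypothetical, and the sketch assumes the probable-cause text is already available when the second attempt is made.

```python
from typing import Callable, List, Optional, Tuple

def classify_error(
    description: str,
    probable_causes: str,
    understand: Callable[[str], Optional[list]],  # hypothetical NLC: None if the text is not understood
    suggest: Callable[[list], List[str]],         # hypothetical CSM: candidate category names
    choose: Callable[[List[str]], List[str]],     # the expert: returns the accepted subset
) -> Tuple[str, List[str]]:
    """Try the error description first, then the probable causes."""
    for source, text in (("description", description),
                         ("probable causes", probable_causes)):
        formed = understand(text)
        if formed is None:        # SNOWY could not make sense of the text
            continue
        accepted = choose(suggest(formed))
        if accepted:              # at least one recommendation was selected
            return source, accepted
    # Both attempts failed: the user falls back on the original interface.
    return "original interface", []
```

The return value records which text source produced the accepted categories, which makes the three outcomes of the paragraph above easy to tell apart.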

One may then question why the interface 
bothers to ask the user for a textual descrip- 
tion of the error when analysis of the prob- 
able causes appears to provide suitable sug- 
gestions. We have found that additional text 
is needed if we are to make suggestions other 
than failed component suggestions. The sys- 
tem would not be able to make suggestions like 
“initialization failures” or “on line failures” if 
we only called the NLC with probable cause 
text. Classifications of this type are present 
in the heuristic hierarchies of the Grumman 
personnel first consulted to test the interface. 
Therefore, we must provide the interface with 
additional texts which could lead to recommen- 
dations other than failed components. 

The operation of the enhanced interface is 
identical to the original after the error message 
has been placed within the heuristic hierarchy. 
The code of the original interface, therefore, 
was disturbed only slightly, and the users of 
the interface did not need to relearn how to
operate the system. 

4 Input to the CSM 

Before addressing the details of the CSM we 
must describe the structures which it takes as 
input. The formation phase of the NLC maps
the logical form constructed by the interpreter 
into the knowledge representation structures 
of the representation language KL-SNOWY 
through the use of formation rules. The for- 
mation algorithm is called to form clauses as 
they finish the interpretation phase. The most 
embedded clause of a sentence is formed first, 
the second most embedded is formed second, 
and so on until the main clause is formed. 
The structures, called object structures and 
relation structures, are used by the CSM to 
make recommendations. Together these two 
types of structures form the kernel of KL- 
SNOWY. There is a significant advantage to 
having SNOWY apply its formation phase to 
the logical form produced by the interpreter. 
This will become apparent during our discus- 
sion of the Classification Suggestion Module in 
section 5, if one understands the structures of 
the representation language. Therefore, it is 
important that the semantics of object and re- 
lation structures is clear. 


4.1 Object Structures 

Object structures represent knowledge about 
physical and abstract objects. Some physical 
objects are trains, tools, mountains, geese, etc., 
and some abstract objects or ideas are sets, 
states, properties, and relations. Conceptual 
relations representing knowledge about the ob- 
ject are represented as slots in the object struc- 
ture frame (see the box surrounding the object 
structure for CPU in Figure 3). 

These relations will either describe the ob- 
ject in some way or attribute actions to it. In 
the CPU object structure example, the slot 
“(process (data ($more (@a3))))” represents
an action attributed to the concept CPU, and 
“(made-of (silicon ($more (@a2))))” represents 
a description. The relation structure names, 
@a3 and @a2, point to relation structures that 
contain additional information which is not 
stored directly under CPU but elsewhere in 
SNOWY’s long-term memory (LTM). In gen- 
eral, concept relations are represented in object 
structures as: 

relation (@a1)              monadic
relation (concept1 (@a1))   diadic

CPU
(is-a (electrical-component))
(part-of (computer ($more (@a1))))
(made-of (silicon ($more (@a2))))
(process (data ($more (@a3))))

@a3
(instance-of (action))
(args (CPU) (data))
(pr (process))
(actor (CPU (q (?))))
(theme (data (q (?))))

Figure 3: A Portion of the Concept CPU
Acquired by SNOWY from Natural Language
Input.

All concepts must have a unique name in
memory so that the knowledge about them can
be integrated in a single place. Therefore, we
need a method for dealing with concepts which 
are not explicitly named in the sentence. An 
example of such a sentence is “The adapter in 
the FEP returned an invalid status.” The subject
of the sentence, “the adapter in the FEP,”
is a complex concept which must be given a
dummy name (a gensym) to uniquely identify it
in LTM. The structure is called an x-structure.
We use a characteristic-features slot to specify 
the necessary and sufficient conditions describ- 
ing this new concept. For this complex concept, 
the representation would be: 

(x1 (cf (is-a (adapter))
        (part-of (FEP))))

The meaning of this is that the x-structure
x1 is a sub-class of adapter, whose members all
have the feature of being a part of a front-end
processor (FEP). This feature is “characteristic”
because it is shared by every member of the
class x1. Complex concepts can arise from natural
language constructs such as existentially
quantified sentences, complex noun phrases, 
and restrictive qualifiers (relative clauses and 
prepositional phrases). 
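
To make the role of the characteristic-features slot concrete, the following sketch stores a complex concept under a generated dummy name and tests membership against its cf slot. It is an illustration under simplifying assumptions, not KL-SNOWY itself: LTM is modeled as a plain dictionary, slot fillers are single strings, and the helper names `make_x_structure` and `is_member` are hypothetical.

```python
import itertools

# Generator of fresh dummy names x1, x2, ... (plays the role of a gensym).
_fresh = itertools.count(1)

def make_x_structure(ltm: dict, features: dict) -> str:
    """Store a complex concept in LTM under a unique dummy name."""
    name = f"x{next(_fresh)}"
    ltm[name] = {"cf": dict(features)}   # characteristic-features slot
    return name

def is_member(ltm: dict, x_name: str, candidate: dict) -> bool:
    """The cf features are necessary and sufficient: every one must hold."""
    cf = ltm[x_name]["cf"]
    return all(candidate.get(slot) == filler for slot, filler in cf.items())

# "the adapter in the FEP"
ltm = {}
adapter_in_fep = make_x_structure(ltm, {"is-a": "adapter", "part-of": "FEP"})
```

Because each call draws a new name, two different complex noun phrases never collide in memory, which is the point of the dummy name.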





4.2 Relation Structures 

Relation structures represent knowledge about 
instances of conceptual relations. Each struc- 
ture contains a verbal concept, its cases and 
their fillers, the quantification of each filler, an 
instance-of slot indicating whether the relation 
is a description, action, proposition (embedded 
relation) or cf-structure, and an optional truth- 
value slot which indicates whether the relation 
is believed to be true or false by SNOWY. In 
the absence of a truth-value slot the statement 
is taken as true by default. For example, the 
relation structure @a3, which represents “CPUs
process data,” is shown at the bottom of Figure
3.

The first slot, instance-of, indicates that @a3
is an instance of an action relation. The args 
slot lists the arguments of the relation. If the 
relation is monadic, the args slot will contain a 
single concept. If the relation is diadic, as is the 
case in this example, the args slot contains two 
concepts, and so on. The pr slot contains the 
verbal concept or primitive. Following the ver- 
bal concept are its thematic cases. Each case is 
filled by a “quantified” concept from the argu- 
ment list. The quantifier of an argument is the 
filler of its q sub-slot. In @a3, both the quantifiers
for CPU and data are unknown, represented
by a question mark. This reflects the
fact that from the statement “CPUs process 
data” it is not clear if all CPUs process all data 
or only some CPUs process some data. Other 
possible fillers of the q slot are: most, many, 
all, cardinal adjectives, and numerals. 
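
The slots just described can be mirrored in a small record type. The sketch below is one possible encoding and not SNOWY's internal representation: the class name `Relation` and its field names are our own, and the only behavior taken from the text is that a missing truth-value slot is read as true.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Relation:
    name: str                            # e.g. "@a3"
    pr: str                              # verbal concept (primitive)
    args: List[str]                      # one concept if monadic, two if diadic
    cases: Dict[str, Tuple[str, str]]    # thematic case -> (filler, quantifier)
    instance_of: str = "action"          # description, action, proposition, ...
    truth_value: str = "t"               # an absent slot is taken as true

    @property
    def arity(self) -> int:
        return len(self.args)

# "CPUs process data": both quantifiers are unknown, written "?".
a3 = Relation(
    name="@a3",
    pr="process",
    args=["CPU", "data"],
    cases={"actor": ("CPU", "?"), "theme": ("data", "?")},
)
```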

Creation of relation structures is normally 
handled by the formation algorithm. This al- 
gorithm constructs structures from the logical 
form by collecting the thematic cases identi- 
fied by the interpreter for sentence clauses. In 
certain sentences, however, the formation al- 
gorithm must be overridden or postponed be- 
cause the verbal concept requires an unusual 
construction to be formed. To handle these 
special cases, we use formation rules which are 
briefly discussed in section 6. 


5 The Classification Suggestion Module (CSM)

The task of the Classification Suggestion Mod- 
ule (CSM) is to take the output from the for- 
mation phase of SNOWY and produce a list of 
error message classifications that can be sug- 
gested to the user. To accomplish this task, the 
CSM scans the output of the formation phase 
of SNOWY looking for certain constructions 
that are likely to lead to plausible suggestions. 
The CSM looks for the following constructions: 
negated relations and relations that indicate 
failures, descriptive relations which explicitly 
or implicitly indicate failed components, and 
complex noun phrases describing failed compo- 
nents. After a set of suggestions is identified, 
the CSM attempts to prioritize them based 
upon an analysis of the expert’s heuristic hi- 
erarchy. This prioritized list of suggestions is 
then presented to the expert. Additionally, if 
the expert selects one or more of the sugges- 
tions, the CSM will attempt to engage the ex- 
pert in a dialog whose purpose is to elicit more 
information. The sections below discuss each of 
the constructions relevant in identifying possi- 
ble suggestions, the prioritization task, and the 
elicitation of additional information. 

5.1 Relation Structures 

The CSM identifies relation structures containing
negated verbal concepts or verbal concepts
that indicate failures. Consider, for example,
the formation of the sentence “The FEP
failed to detect an acknowledgement from the
i/o adapter,” which contains a negated verbal 
concept. 

The CSM scans the formation output for 
relation structures, such as @a27 below, and
examines their truth-value slots. If the truth-value
slot indicates that a verbal concept is explicitly
negated, as become-aware is in the
example below, we save the relation structure.
The system can then use the cases of these 
structures to generate plausible classification 
suggestions (see the following section). 
@a27
(truth-value (f))
(args (fep) (acknowledgement))
(pr (become-aware))
(actor (fep (q (constant))))
(theme (acknowledgement (q (?))))
(instance-of (proposition))

Example 1 

Verbal concepts that implicitly indicate fail- 
ures are also identified. In the sentence, “The 
option plane microcode crashed,” the verb 
crashed indicates a failure. This is immediately 
obvious to the CSM because of the verbal con- 
cept that crash is mapped to during formation. 

option-plane-microcode
(is-a (microcode))

@a30
(args (option-plane-microcode))
(pr (fail))
(actor (option-plane-microcode (q (constant))))

Example 2 

The verb rules for the verb crash map it to 
the verbal concept fail. Other verbs which are 
mapped to the verbal concept fail are break,
collapse, and fail. Because SNOWY is able
to determine the underlying meaning of these 
verbs, the CSM has an easy time selecting 
negated relations and relations indicating fail- 
ures. 
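
The selection step of this section can be sketched in a few lines. The code below assumes relation structures are available as dictionaries keyed by slot name, which is a simplification of the formation output; the table `VERB_TO_CONCEPT` lists only the four verbs named above, and the function names are hypothetical.

```python
# Verb rules named in the text: these verbs all map to the verbal concept "fail".
VERB_TO_CONCEPT = {"crash": "fail", "break": "fail", "collapse": "fail", "fail": "fail"}

def denotes_failure(relation: dict) -> bool:
    """A relation is kept if it is explicitly negated or its concept is 'fail'."""
    negated = relation.get("truth-value") == "f"   # a missing slot means true
    return negated or relation.get("pr") == "fail"

def select_failures(relations: list) -> list:
    return [r for r in relations if denotes_failure(r)]

a27 = {"name": "@a27", "truth-value": "f", "pr": "become-aware",
       "actor": "fep", "theme": "acknowledgement"}
a30 = {"name": "@a30", "pr": VERB_TO_CONCEPT["crash"],
       "actor": "option-plane-microcode"}
a3 = {"name": "@a3", "pr": "process", "actor": "CPU", "theme": "data"}

selected = select_failures([a27, a30, a3])
```

Here @a27 is kept because of its truth-value slot, @a30 because crash was mapped to fail during formation, and @a3 is discarded.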

5.2 Case Roles as Plausible 
Classifications 

Some of the cases of these relation structures, 
such as actor, theme, at-loc, and at-time,
lead to plausible classifications. In Example
1 above, the relation structure @a27 has two
case slots: the actor case, filled by fep, and the
theme case, filled by acknowledgement. These
two cases suggest two possible error message
classifications. One possible classification is the 
class of error messages generated by “fep fail- 
ures”. Because all the relation structures se- 
lected by the CSM denote failures, the actor of 
each relation represents a component that has 
failed to accomplish some task.¹ That failed
component may also be responsible for gener- 
ating other error messages. Therefore, it makes 
sense to recommend a class of error messages 
caused by the failed component. For this example,
the CSM would save the classification
“fep failure” as a possible classification to be 
recommended to the expert. 

Another possible classification is “acknowledgement
failures”.² This supports the notion
that the theme case of failure relations may 
lead to plausible classifications, when the orig- 
inal sentence is a “fail to” construction. In 
the sentence “The common data buffer failed 
to update the system configuration table,” the 
theme case, filled by “the system configuration 
table,” may potentially represent a category of 
errors. While the actor case represents “what
failed,” the theme case describes the component
that failed to be acted upon. Consequently, 
one might think that the theme case is not as 
likely a source of classifications as the actor 
case. We can, however, conceptualize a class 
of error messages which indicates the failure of 
some component to update the system config- 
uration table. Each member of the class would 
have similar operational advisories instructing 
systems engineers in how to handle the failed 
update. Therefore, the CSM saves the theme 
case fillers of negated relations as possible clas- 
sifications. 

¹ We must recognize that if the expert describes failures
of irrelevant components, the system will necessarily make
irrelevant recommendations, which the expert may ignore.

² These failures are so common that they are referred to
as NOAOKs.

At-time cases can also lead to plausible
classifications. These cases indicate when a
failure occurred, which may be very significant.
For example, consider the sentence
“The FEP failed to respond during initializa- 
tion.” The prepositional phrase “during initial- 
ization” tells us that the failure occurred during 
the process of initialization. In general, if the 
filler of the at-time case is a process, we rec- 
ommend that filler as a possible classification. 
For the example above this gives “initialization 
failures”. It is our belief that the fillers of the 
at-time case should almost always be processes. 
This is because it makes little sense to use a 
time NP (a noun phrase specifying a time) except
in certain situations.³

At-loc cases can lead to plausible classifica- 
tions. For example, “The transmitter/receiver 
failed in the HIM” is a sentence in which the at- 
loc case, filled by “the HIM,” represents a pos- 
sible category of error messages. Because the 
failure occurred within the HIM, we can infer 
that the transmitter/receiver is located within 
the HIM and therefore may be a sub-part of 
the HIM. The HIM, which is the larger object, 
is likely to have other sub-parts which may fail. 
This means that the class of “HIM failures” is 
likely to be a good category of error messages. 
One should note that the object and its sub- 
part(s) form a part-of hierarchy. Discussion of 
how part-of hierarchies can be used to help pri- 
oritize suggested classifications can be found in 
section 5.5, Part-Of Hierarchies.


³ In most cases, we would not expect to see a sentence
with an at-time case filled by a time NP, such as, “The 
FEP failed to respond to the HIM at 10 pm.” Obviously 
the expert giving such a description does not realize that 
he/she has described a specific error event, while what 
we are after is a more general description of the error. 
However, it may make sense to write, “The FEP fails to 
respond to the HIM during the winter”. This sentence 
would lead to the classification, “winter failures,” which 
seems plausible. In the cases where the at-time filler is 
a time NP, the CSM asks the expert, “Is this the only
time that this error occurs?” If the expert responds with 
an affirmative answer, the system retains the filler as a 
possible classification. 
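
The case-role heuristics of this section can be summarized in a short sketch. It assumes, for illustration only, that a saved failure relation is a dictionary from case names to fillers and that the set of known processes is supplied by the caller; the function name `suggest_from_cases` is hypothetical, and the dialog used for time NPs (footnote 3) is left out.

```python
def suggest_from_cases(relation: dict, processes: set) -> list:
    """Turn the case fillers of a saved failure relation into category names."""
    suggestions = []
    actor = relation.get("actor")
    if actor:                                  # the component that failed
        suggestions.append(f"{actor} failures")
    theme = relation.get("theme")
    if theme and relation.get("truth-value") == "f":
        suggestions.append(f"{theme} failures")   # theme of a negated relation
    at_time = relation.get("at-time")
    if at_time in processes:                   # only process fillers are kept
        suggestions.append(f"{at_time} failures")
    at_loc = relation.get("at-loc")
    if at_loc:                                 # the enclosing component
        suggestions.append(f"{at_loc} failures")
    return suggestions

processes = {"initialization"}
a27 = {"truth-value": "f", "pr": "become-aware",
       "actor": "fep", "theme": "acknowledgement"}
no_response = {"truth-value": "f", "pr": "respond",
               "actor": "FEP", "at-time": "initialization"}
in_him = {"pr": "fail", "actor": "transmitter/receiver", "at-loc": "HIM"}
```

With these inputs, `suggest_from_cases(a27, processes)` yields the two classifications discussed for Example 1, and the other two relations yield the “initialization failures” and “HIM failures” categories respectively.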


5.3 Descriptive Relations and Noun 
Phrases 

Concepts that have negative properties may 
lead to plausible classifications. If the ex- 
pert mentions a defective component within 
his/her error message description, that compo- 
nent is likely to contribute to the error. The 
CSM identifies descriptive relations that indi- 
cate faulty components, as in “the i/o adapter 
is not operational,” “the HIM may be down,” 
or, “the HIM is unable to reset the status reg- 
ister.” In these cases, the predicate adjective is 
examined to see if a failure is present. Predi- 
cates that explicitly or implicitly describe nega- 
tive properties of network components provide 
strong indications that the components they 
modify have failed. Explicitly negated predi- 
cates are those that clearly indicate a negation, 
either by inclusion of the adverbs not and no, 
or through the use of negative prefixes. Some 
examples of explicitly negated predicates are 
abnormal, unable, disabled, uninhibited, and incapable.
Important features, such as negative
prefixes, are stored in a lexicon for each word. 
For example, the word abnormal has the fol- 
lowing feature: 

abnormal
(neg-prefix (normal))

The neg-prefix slot tells us that abnormal
contains a negative prefix affixed to the root 
word normal. 

The representation of descriptive relations 
that denote negated properties is exactly the 
same as the representation of negated actions 
discussed in an earlier section. For example, 
the output from the formation phase for the 
sentence “the i/o adapter is abnormal” is 

@a39
(truth-value (f))
(args (i/o-adapter) (normal))
(pr (has-property))
(descr-subj (i/o-adapter (q (constant))))
(descr-obj (normal (q (?))))
(instance-of (description))
Notice that the descriptive relation has-property
is negated. The meaning of the relation
structure, @a39, is “the i/o adapter does
not have the property of being normal.” The
CSM can determine that this structure denotes
a negative property by examining the truth-value
slot in search of an “f.” A more difficult
sentence to handle would be “the i/o adapter
is not abnormal.” In this case, the formation
phase realizes that there is a double negation.
The final structure, therefore, will not have a
truth-value slot filled by “f,” and we will not
recommend “i/o-adapter failures” as a category 
of error messages. 

Some predicates may indicate a failure or 
negation but are not explicitly negated. Examples
of this type of predicate are defective,
down, and broken. In these cases, the meaning
of the predicate adjective is needed if we are 
to determine that a failure has occurred. Cur- 
rently, a sub-hierarchy within SNOWY’s LTM 
maintains knowledge of properties. 
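
The treatment of explicit, implicit, and double negation can be illustrated by counting negations and testing their parity. This is a sketch of the idea only: in SNOWY the double negation is resolved during formation, not by a function like this one. The lexicon entries other than abnormal, the set `IMPLICIT_NEGATIVE` standing in for the property sub-hierarchy, and the handling of a negated implicit predicate such as “not defective” are all assumptions.

```python
# Hypothetical lexicon: only the entry for "abnormal" appears in the text.
LEXICON = {
    "abnormal": {"neg-prefix": "normal"},
    "unable": {"neg-prefix": "able"},
    "incapable": {"neg-prefix": "capable"},
}
# Stand-in for the property sub-hierarchy in LTM.
IMPLICIT_NEGATIVE = {"defective", "down", "broken"}

def indicates_failure(predicate: str, negated: bool = False) -> bool:
    """True when the predicate carries an odd number of negations."""
    negations = 1 if negated else 0           # adverbs such as "not"
    entry = LEXICON.get(predicate)
    if entry is not None:                     # negative prefix on the word
        negations += 1
        predicate = entry["neg-prefix"]
    if predicate in IMPLICIT_NEGATIVE:        # meaning-based failure
        negations += 1
    return negations % 2 == 1
```

Under these assumptions, “is abnormal” and “is not operational” both indicate a failure, while “is not abnormal” does not, matching the double-negation example above.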

The CSM also identifies complex noun 
phrases that indicate faulty components, as 
in “the defective HIM...” or “the failed data 
bus....” This is accomplished by examining the 
x-structures of complex noun phrases for negative
properties. If the x-structure of a complex
noun phrase has a negative property, the CSM 
will save the super-concept of the x-structure 
as a possible classification. From the sentence 
“All further polling will cease pending com- 
ponent fault isolation of the failed HIM,” we 
would like to recommend “HIM failures” as a 
possible classification. The relevant portion of 
the representation provided to the CSM by the 
formation phase is 

x1
(cf (is-a (HIM)) (@a41))

@a41
(args (x1) (defective))
(pr (has-property))
(descr-subj (HIM (q (constant))))
(descr-obj (defective))


Defective indicates a failure so the CSM 
saves the super-concept of x1, HIM, as a possible
classification.

5.4 Prioritizing Recommendations 

Once a set of candidate classifications has 
been determined from a sequence of text, the 
CSM orders the candidates from highly recommended
to least recommended. Several orderings
are possible.

• If it can be determined that the user’s 
heuristic hierarchy is structured based 
upon component/sub-component relationships,
then failed components should be
highly recommended. 

• If it can be determined that the user’s 
heuristic hierarchy is structured based 
upon process/sub-process relationships,
then verbal concepts that represent pro- 
cesses or at-time slot fillers which are pro- 
cesses should be highly recommended, e.g., 
“the microcode fails during initialization,” 
or, “the microcode failed to initialize.” 

• If nothing about the user’s hierarchy can 
be determined, then fall back on the struc- 
ture of the factual hierarchy which is 
a structural one, i.e., failed components 
should be highly recommended. 

By prioritizing the classifications, the most 
relevant classifications (determined heuristi- 
cally using the rules above) can be presented 
to the expert as such. This helps when the set 
of possible classifications is large. 
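
The three ordering rules above amount to choosing a preferred kind of candidate and moving candidates of that kind to the front. The sketch below assumes each candidate is tagged as either a "component" or a "process" and that the kind of the user's hierarchy has already been determined by some other means; both the tags and the function name `prioritize` are hypothetical.

```python
from typing import List, Optional, Tuple

def prioritize(candidates: List[Tuple[str, str]],
               hierarchy_kind: Optional[str] = None) -> List[Tuple[str, str]]:
    """Order (name, kind) candidates, preferred kind first.

    hierarchy_kind is "component", "process", or None when nothing about
    the user's hierarchy can be determined; None falls back on the
    structural (component) ordering of the factual hierarchy.
    """
    preferred = hierarchy_kind or "component"
    # sorted() is stable, so candidates of the same kind keep their order.
    return sorted(candidates, key=lambda c: c[1] != preferred)

candidates = [("initialization failures", "process"),
              ("FEP failures", "component"),
              ("HIM failures", "component")]
```

A stable sort is used so that any ordering already present among equally preferred candidates is preserved.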

5.5 Part-Of Hierarchies 

There may also be a hierarchical relationship 
between several of the candidate classifications, 
especially when the candidates are selected 
from text describing probable causes. For ex- 
ample, the probable causes for error 141 are: 
1. FEP i/o adapter failed.
2. FEP option plane failed.
3. I/O adapter port on the 4-port controller failed.
4. FEP transmitter/receiver failed.

FEP
  part-of: FEP I/O Adapter
  part-of: FEP Option Plane
  part-of: FEP T/R

Figure 4: FEP Part-Of Hierarchy

This leads to the failed component hierar- 
chy shown in Figure 4. Probable causes 1, 
2, and 4 describe the failures of sub-parts 
of a FEP. Determining that the FEP is re- 
lated to i/o adapter, option plane, and trans- 
mitter/receiver by the has-part relation is the 
job of the noun-noun interpretation algorithms 
contained within SNOWY. That is, SNOWY 
is responsible for determining the meaning of 
complex noun phrases such as “FEP option 
plane,” which is, “an option plane that is part 
of a FEP.” The CSM simply has to look for 
a part-of relation under each of the sub-parts 
to recognize that a hierarchy exists. The ex- 
istence of a part-of hierarchy provides strong 
evidence that the root of the hierarchy should 
be an error message category. In fact, it makes 
sense for the system to recommend the entire 
hierarchy to the expert. 
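
Detecting such a hierarchy reduces to grouping the failed components by the filler of their part-of relation. The following sketch assumes the part-of relations produced by SNOWY's noun-noun interpretation are available as a simple mapping from sub-part to whole; the function name `part_of_roots` and the threshold of at least two sub-parts are our own assumptions.

```python
from collections import defaultdict
from typing import Dict, List

def part_of_roots(candidates: List[str], part_of: Dict[str, str]) -> Dict[str, List[str]]:
    """Group failed components by their whole; keep wholes with several sub-parts."""
    groups = defaultdict(list)
    for component in candidates:
        whole = part_of.get(component)      # look for a part-of relation
        if whole is not None:
            groups[whole].append(component)
    return {root: parts for root, parts in groups.items() if len(parts) > 1}

# Failed components from the probable causes of error 141.
failed = ["FEP i/o adapter", "FEP option plane",
          "i/o adapter port", "FEP transmitter/receiver"]
part_of = {"FEP i/o adapter": "FEP",
           "FEP option plane": "FEP",
           "i/o adapter port": "4-port controller",
           "FEP transmitter/receiver": "FEP"}
hierarchy = part_of_roots(failed, part_of)
```

For error 141 this leaves FEP as the only root, with the three sub-parts of Figure 4 beneath it; the 4-port controller is dropped because only one of its parts is mentioned.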

5.6 Eliciting Additional Information 

Up to this point, we have discussed how the 
NLC understands natural language input and 
how the CSM uses that understanding to iden- 
tify and prioritize relevant categories of errors 
for presentation to the expert. The knowl- 
edge acquisition task does not end, however, 
when the expert selects a suggested
classification. When experts accept suggested
classifications, the CSM will “keep them talking”
by prompting them with questions designed to 
trigger their recall of additional error message 
classifications. These questions prompt the ex- 
pert for the names of similar messages that they 
feel would fall under the suggested category. 
The CSM also asks the user for other categories 
of errors that may be similar to the suggested 
category. 

6 An Overview of the Natural 
Language Component 

The NLC is an application of SNOWY. 
SNOWY is a system which integrates problem 
solving, knowledge acquisition, and information
retrieval. In [Gomez & Segami, 1989] it
was shown that, “in order for SNOWY to un- 
derstand text, it needs to start with a minimum 
set of concepts which categorizes the world into 
states, actions, collections, etc.” This a priori 
set of concepts, or ontology, is organized into 
a hierarchy based upon is-a relationships. The 
hierarchy is part of SNOWY’s LTM. This LTM 
maintains the information that SNOWY has 
gathered from natural language input. 

Each sentence presented to SNOWY under- 
goes three phases: a parsing and interpretation 
phase, a formation phase, and a recognition 
and integration phase. Because the recognition 
and integration phase is primarily concerned 
with updating SNOWY’s LTM, which is unnec- 
essary for our task, we only call upon SNOWY 
to parse, interpret, and form the expert’s nat- 
ural language input. These three processes are 
described below. 

Parsing a sentence involves identifying its 
syntactic structures. The parser used by 
SNOWY is called WUP, which stands for word 
usage parser [Gomez 1989]. The underlying 
philosophy of WUP is that the syntactic us- 
ages of words play a greater role in parsing than 
is generally admitted. How the usages are
implemented and how the parser operates are
beyond the scope of this paper. For our purposes, it
is enough to know that the parser identifies 
the syntactic categories of the sentence. The 
syntactic categories used by WUP are: sub- 
ject, verb, object, indirect object, prepositional 
phrase (PP), predicate, subordinate clause, and 
conjoined clause. The parser is not responsible 
for determining the attachment of prepositions, 
the verbal concept underlying the main verb, or 
the meaning of complex noun phrases. That is 
the duty of the interpreter. 

6.1 Interpretation 

The interpretation process is responsible for 
constructing the logical form from the syntac- 
tic constituents identified by the parser. This 
logical form represents the semantics of the sen- 
tence independent of any context. As each con- 
stituent of a sentence is identified, it is sent 
to the interpreter. It is important to point 
out that a constituent need not be interpreted 
the first time that it is seen by the interpreter. 
In fact, there are many cases where the inter- 
pretation of a particular constituent must be 
postponed until all the constituents of the sen- 
tence have been read. The constituent could 
be a noun phrase representing the subject or 
object of the sentence, in which case the in- 
terpreter must determine the meaning of the 
noun phrase. If the constituent is a verb, the 
interpreter must determine the underlying ver- 
bal concept that the verb represents. If the 
constituent is a prepositional phrase, the inter- 
preter must determine its attachment and its 
meaning. Each of these three types of interpre- 
tation has its own set of interpretation rules. 
We discuss each of the three types of
interpretation in turn and then conclude with
a discussion of how interpretation is applied
in the domain at hand.

6.1.1 Noun Phrases 

Interpreting noun phrases requires a great deal
of knowledge of the meanings of nouns and
adjectives. This is evident in the noun phrase
“arthropod legs,” which is the subject of the
simple sentence “Arthropod legs are jointed.”
We can make sense of this phrase only be- 
cause we know very well that arthropods, such 
as spiders and crustaceans, have legs. This 
knowledge allows us to determine that the NP 
above means “the legs that are part of arthro- 
pods.” Without any knowledge of arthropod or 
leg we would be unable to determine a rela- 
tionship between these two nouns. Similarly, 
knowledge of the adjective “wooden” is nec- 
essary to determine the meaning of the NP, 
“wooden legs,” which is “legs made of wood.” 
SNOWY stores knowledge of nouns as rela- 
tions under their corresponding LTM concepts 
in SNOWY’s concept hierarchy. Knowledge of 
adjectives is stored as interpretation rules. The 
noun phrase interpretation algorithm uses this 
knowledge when considering each pair of items 
in a given NP. 
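
The pairwise use of this knowledge can be illustrated with a small Python sketch. It is not SNOWY's algorithm: the miniature LTM, the adjective rule table, the singular-form lookup, and the function names are assumptions made for illustration, and for simplicity each modifier is paired only with the head noun.

```python
# Hypothetical miniature LTM: relations stored under noun concepts.
LTM = {"arthropod": {"has-part": {"leg"}}}

# Hypothetical adjective interpretation rules.
ADJECTIVE_RULES = {"wooden": ("made-of", "wood")}

# Tiny stand-in for morphological analysis.
SINGULAR = {"legs": "leg"}


def interpret_pair(modifier, head):
    """Interpret one modifier/head pair, or return None if nothing applies."""
    head = SINGULAR.get(head, head)
    if modifier in ADJECTIVE_RULES:
        # Adjective knowledge is stored as an interpretation rule.
        return (head, ADJECTIVE_RULES[modifier])
    if head in LTM.get(modifier, {}).get("has-part", set()):
        # Noun knowledge is stored as a relation under the LTM concept.
        return (head, ("part-of", modifier))
    return None


def interpret_np(words):
    """Pair every modifier with the head noun (the last word)."""
    *modifiers, head = [w.lower() for w in words]
    return [interpret_pair(m, head) for m in modifiers]


print(interpret_np(["Arthropod", "legs"]))
print(interpret_np(["wooden", "legs"]))
print(interpret_np(["granite", "legs"]))
```

With no knowledge of the modifier, as in the last call, no relation between the two words can be established, which mirrors the point made above about arthropod and leg.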

6.1.2 Verb Rules and Verbal Concepts 

The interpreter algorithm makes use of verb 
rules to establish the underlying verbal con- 
cepts of sentences. These verbal concepts rep- 
resent the meaning of the verb in the sentence. 
Below are the verb rules for the verb dump:

A Portion of the Verb Rules for the Verb Dump

    (dump
      (((dump) (dumps) (dumped) (dumping)
        (has dumped) (had dumped))
       (obj
        ((if part-of obj computer)
         (primitive-is transfer-data)
         (semantic-role-of-is obj from-loc)))))

The “obj” slot contains a verb rule which will
be tried when the parser passes the object
constituent to the interpreter. This rule
chooses the verbal concept transfer-data and
marks the object as filling the from-loc case
of this verbal concept in the event that the
object of the sentence is a part of a computer.
Therefore, this rule would be used when SNOWY
is interpreting a sentence like “Dumping the
CPU registers would help isolate...” The
interpretation of the main clause of this
sentence would be “somebody transfer-data from
the CPU registers (from-loc) to some unknown
location (to-loc).”
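
How such a rule might be applied can be sketched in Python. This is an assumption-laden illustration rather than SNOWY's interpreter: the dictionary layout of the rule, the PART_OF table standing in for the factual knowledge, and the function names are all hypothetical.

```python
# Hypothetical stand-in for factual knowledge about components.
PART_OF = {"cpu registers": "computer"}

# Hypothetical dictionary rendering of the verb rule for "dump".
VERB_RULES = {
    "dump": {
        "forms": {"dump", "dumps", "dumped", "dumping",
                  "has dumped", "had dumped"},
        "obj": [
            {
                "condition": ("part-of", "computer"),
                "primitive-is": "transfer-data",
                "semantic-role-of-obj": "from-loc",
            }
        ],
    }
}


def find_entry(verb_form):
    """Return the rule entry whose surface forms include verb_form."""
    for entry in VERB_RULES.values():
        if verb_form in entry["forms"]:
            return entry
    return None


def interpret_object(verb_form, obj):
    """Try the rules in the obj slot when the object constituent arrives."""
    entry = find_entry(verb_form)
    if entry is None:
        return None
    for rule in entry["obj"]:
        relation, whole = rule["condition"]
        if relation == "part-of" and PART_OF.get(obj.lower()) == whole:
            return {
                "verbal-concept": rule["primitive-is"],
                rule["semantic-role-of-obj"]: obj,
                "to-loc": None,  # destination is unknown in the example
            }
    return None


result = interpret_object("dumping", "CPU registers")
print(result)
```

When the object is not known to be part of a computer, no rule fires and the sketch returns None, leaving the verbal concept undetermined.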

While verbs like dump have very clear
meanings, other verbs can be quite ambiguous.
The verb go is reported to have 63 different
meanings [Hirst, 1992]. We can sidestep this
problem in most cases, however, because the
domain of the incoming natural language is
restricted to CCMS network error message
descriptions.

Once the verbal concept has been deter- 
mined, the interpreter attempts to fill the the- 
matic cases of the verbal concept. Interpre- 
tation is now said to be driven by the verbal 
concept in the sense that we will attempt to 
place each of the other constituents within its 
framework. Thematic cases or roles show how 
noun phrases are related to the verbal concepts 
of sentences. Some of the most common
thematic cases used by KL-SNOWY are: actor,
theme, instrument, at-loc, from-loc, to-loc,
at-time, init-time, end-time,
descriptive-subject, and descriptive-object.


6.1.3 Prepositional Phrases 

Interpreting prepositional phrases involves se- 
lecting the proper attachment (what sentence 
constituent is modified by the prepositional 
phrase) and establishing the meaning of the 
modification. Meanings and attachments are
established by the verbal concept and
interpretation rules under the given
preposition [Gomez et al., 1992b]. Verbal
concepts claim prepositional phrases through
preposition rules (P-rules) stored under them.
Noun phrases claim prepositional phrases
through P-rules stored under the preposition.


6.1.4 Interpretation in the CCMS 
Network Domain 

While interpretation of arbitrary text is
currently an open problem, we can use the fact
that we know the domain of the incoming
discourse and the task of the CSM to limit the
scope of the interpretation so that it is
manageable. For instance, we have found that a
significant percentage of the noun phrases used
in error message descriptions and in the
probable causes indicate specific components of
the CCMS network.⁴ The following is a table of
some of the most common noun phrases in this
domain:

Table 1: Common Noun Phrases in the CCMS
Domain

    active cpu             i/o card
    common data buffer     data bus
    error message          GSE FEP
    ground data bus        LDB FEP
    FEP option plane       PCM FEP
    GSE data bus           standby cpu
    GSE microcode          i/o adapter
    HIM status register    option plane
    system config table

⁴ The components may be hardware components or
software programs and data structures.

The semantics of these noun phrases can be
captured by a few noun phrase interpretation
rules. For instance, the phrase “data bus” is
taken to mean a “bus for transporting data,”
where in this case bus is not a vehicle which
makes frequent stops, but is a physical
structure for transporting data and control
information. Because we know the domain of the
natural language input we will simply ignore
the vehicle meaning of bus. A rule stored under
the concept “data” will build the following
interpretation when the noun phrase is
interpreted:

    (bus (transport (data)))

Another rule will look for part-of
relationships between the nouns in noun
phrases. This rule captures the meaning of
phrases like “GSE data bus,” “GSE FEP,” “GSE
microcode,” and “FEP option plane.” The
interpretations of these four phrases are:

    ((bus (transport (data)))
     (part-of (GSE)))
    (FEP (part-of (GSE)))
    (microcode (part-of (GSE)))
    (option-plane (part-of (FEP)))

Of course, to determine these part-of relations
we must know a priori the physical structure of
the CCMS network. This a priori information has
been assembled by a knowledge engineer and is
stored in AKAI’s factual hierarchy. Therefore,
we can determine these relations simply by
consulting the factual hierarchy.

Verb rules need to be provided for the verbs
commonly used in error descriptions and
probable causes. A list of the verbs for which
verb rules were added is given in Table 2. Each
of these verb rules must specify a verbal
concept.

Table 2: Verbs needing New Verb Rules

    activate     fail          poll
    command      generate      reset
    detect       initialize    respond
    dump         isolate

Table 3 lists the new verbal concepts created
for this domain.

Table 3: New Verbal Concepts

    activate         initialize
    command          isolate
    become-aware     poll
    fail             reset
    fail-negation    respond
    generate         transfer-data

Because the verb rules and verbal concepts
added to SNOWY are dependent on knowing the LTM
categories for nouns commonly used in the CCMS
domain, it was necessary to add them to SNOWY’s
a priori hierarchy. Table 4 is a partial list
of the concepts that were added to SNOWY’s LTM.

Table 4: New Concepts added to LTM

    acknowledgement    cpu             PCM
    adapter            FEP             register
    board              HIM             signal
    buffer             i/o             switch
    bus                LDB             transceiver
    card               microcode       transmission
    computer           option-plane    uplink

6.1.5 Formation Rules

Formation rules are stored under verbal
concepts. When the formation algorithm is
activated, it looks to see if the verbal
concept selected by the interpretation process
has any special formation rules stored under
it.⁵ If formation rules are found, the normal
formation algorithm is overridden and the
system attempts to fire them. If a rule fires
successfully, its consequent list is evaluated,
effectively taking over the task of formation.
Let us now discuss an example of a formation
rule written by the authors to handle a special
construction used in the CCMS domain.

⁵ This discussion assumes that the interpreter
was able to determine a verbal concept. In the
event that no verbal concept was selected, the
formation phase will be unable to construct a
relation structure and is aborted.

Negated relations may come from sentences which
use the “fail to” construction, or from
sentences with explicitly negated verbs. The
“fail to” construction is one in which the main
verb of the main clause is fail and fail is
followed by an embedded clause beginning with
the word to. The representation of sentences
using this type of construction is a relation
structure representing an embedded clause whose
verbal concept has been negated. The formation
rule responsible for creating this structure,
shown below, is stored under the verbal concept
fail-negation in the f-rules slot.

    (fail-negation
      (is-a (description))
      (subj (thing (descr-subj)))
      (obj (proposition (descr-obj)))
      (f-rules
        (fire-all
          (t (negate-relation)))))

This rule calls a LISP function, called
negate-relation, to negate the embedded clause.
Take for example the sentence “The FEP failed
to detect a response from the i/o adapter.” We
would like to end up with KL-SNOWY structures
that represent that the FEP did not become
aware of a response from the i/o adapter.
Therefore, the task of the negate-relation
function is to place an f in a truth-value slot
of the relation structure associated with the
embedded relation “[FEP] detect a response from
the i/o adapter.”
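
The effect of this formation rule can be sketched in Python. The fragment below is an illustration only, not the authors' LISP function: the dictionary form of the relation structure, the slot name truth-value, the marker "f", the actor role assigned to the FEP, and the helper form_fail_to are all assumptions.

```python
def negate_relation(relation):
    """Return a copy of the relation with its truth-value slot set to 'f'."""
    negated = dict(relation)
    negated["truth-value"] = "f"
    return negated


def form_fail_to(subject, embedded):
    """Form a 'fail to' sentence: the embedded clause takes the main-clause
    subject and is then negated, instead of building a separate relation
    for the verb fail itself."""
    relation = {"actor": subject, **embedded}
    return negate_relation(relation)


# "The FEP failed to detect a response from the i/o adapter."
embedded = {
    "verbal-concept": "become-aware",  # concept assumed for "detect"
    "theme": "response",
    "from-loc": "i/o adapter",
}
structure = form_fail_to("FEP", embedded)
print(structure)
```

The resulting structure states that the FEP did not become aware of a response from the i/o adapter, and the original embedded relation is left unmodified.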

7 Conclusions and Future 
Work 

We have shown how natural language input can
be used to infer classifications suitable for
inclusion into the heuristic hierarchies of
AKAI, in a real-world environment. We are
currently in the early stages of the
implementation of these ideas. Very little work
needs to be done on the NLC, per se, because
SNOWY is a working system. The bulk of our
effort is, therefore, focused on implementing
the CSM. Nevertheless, there are several data
files used by SNOWY that must be scaled up if
the enhancements of AKAI are to work “outside
the lab.”

One such data file is SNOWY’s lexicon. To
scale it up, a machine-readable dictionary
created by the Summer Institute of Linguistics,
called Englex, is being adapted for use by
SNOWY. Specifically, entries in Englex are
being converted into a format that SNOWY can
read and added to SNOWY’s lexicon. Englex
contains morphological data for approximately
11,000 nouns, 4,000 verbs, and 4,400
adjectives, as well as adverbs, acronyms and
abbreviations, proper nouns, prepositions,
determiners, conjunctions, quantifiers, etc.
Especially useful are markers indicating
negative prefixes and nominalizations for
nouns. By incorporating these words into
SNOWY’s lexicon, we hope to minimize the
problem of encountering unknown words during
the parsing of an expert’s description.

Other data that will need to be expanded 
are SNOWY’s verb rules and verbal concepts, 
interpretation rules for interpreting complex 
noun and prepositional phrases, and new for- 
mation rules for handling special sentence con- 
structions. At first glance this task may seem 
quite daunting, but because we are receiving 
natural language input constrained to the do- 
main of CCMS network messages, we can ex- 
pect a limit to the diversity and complexity of 
the incoming text. This claim is supported by 
an analysis of the text that makes up the prob- 
able causes and advisory data currently stored 
in OPERA. 

While extension of the NLC involves data,
work on the CSM requires coding changes. It
is important to note that the complexity of
implementing the CSM is significantly reduced
by the robustness of SNOWY’s representation.
Determining failures and their related cases is
a simple task, assuming that SNOWY has been
able to create the appropriate structures. This
underscores the importance of an adequate
representation for the purpose of acquiring
knowledge.